Description

FIELD OF THE INVENTION 

[0001] The present invention pertains to the field of 
client-server computer networking. More particularly, 
the present invention relates to a method and apparatus 
for providing proxying and document transcoding in a 
server in a computer network. 

BACKGROUND OF THE INVENTION 

[0002] The number of people using personal comput- 
ers has increased substantially in recent years, and 
along with this increase has come an explosion in the 
use of the Internet. One particular aspect of the Internet 
which has gained widespread use is the World-Wide 
Web ("the Web"). The Web is a collection of formatted 
hypertext pages located on numerous computers 
around the world that are logically connected by the In- 
ternet. Advances in network technology and software 
providing user interfaces to the Web ("Web browsers") 
have made the Web accessible to a large segment of 
the population. However, despite the growth in the de- 
velopment and use of the Web, many people are still 
unable to take advantage of this important resource. 
[0003] Access to the Web has been limited thus far 
mostly to people who have access to a personal com- 
puter. However, many people cannot afford the cost of 
even a relatively inexpensive personal computer, while 
others are either unable or unwilling to learn the basic 
computer skills that are required to access the Web. Fur- 
thermore, Web browsers in the prior art generally do not 
provide the degree of user-friendliness desired by some 
people, and many computer novices do not have the pa- 
tience to learn how to use the software. Therefore, it 
would be desirable to provide an inexpensive means by 
which a person can access the Web without the use of 
a personal computer. In particular, it would be desirable 
for a person to be able to access the Web pages using 
an ordinary television set and a remote control, so that 
the person feels more as if he or she is simply changing 
television channels, rather than utilizing a complex
computer network.

[0004] Prior art Web technology also has other signif- 
icant limitations which can make a person's experience 
unpleasant when browsing the Web. Web documents 
are commonly written in HTML (Hypertext Mark-up Lan- 
guage). HTML documents sometimes contain bugs (er- 
rors) or have features that are not recognized by certain 
Web browsers. These bugs or quirks in a document can 
cause a Web browser to fail. Thus, what is needed is a 
means for reducing the frequency with which client sys- 
tems fail due to bugs or quirks in HTML documents. 
[0005] The prior art, represented by the document
"Application-specific Proxy Servers as HTTP Stream
Transducers" (C. Brooks et al., 4th International WWW
Conference, Boston, December 1995), does not sufficiently
prevent client systems from failing due to bugs in HTML
documents, partly because it does not use a database of
information about possible error conditions.

SUMMARY OF THE INVENTION

[0006] A method is described of providing proxy serv- 
ices to a client for accessing a document stored on a 
remote server as set forth in the appended claims. 
[0007] Other features of the present invention will be
apparent from the accompanying drawings and from the 
detailed description which follows. 

BRIEF DESCRIPTION OF THE DRAWINGS 

[0008] The present invention is illustrated by way of
example and not limitation in the figures of the
accompanying drawings, in which like references indicate
similar elements and in which:

Figure 1 illustrates several clients connected to a
proxying server in a network.

Figure 2 illustrates a client according to the present
invention.

Figure 3 is a block diagram of a server according to
the present invention.

Figure 4A illustrates a server including a proxy
cache and a transcoder.

Figure 4B illustrates databases used in a server
according to the present invention.

Figure 5 is a flow diagram illustrating a routine for
transcoding a document retrieved from a remote
server using data stored in a persistent database.

Figure 6 is a flow diagram illustrating a routine for
transcoding an HTML document for purposes of
eliminating bugs or undesirable features.

Figure 7 is a flow diagram illustrating a routine for
reducing latency when downloading a document
referencing an image to a client.

Figure 8 is a flow diagram illustrating a routine for
updating documents stored in the proxy cache using
data stored in a persistent database.

Figure 9 is a flow diagram illustrating a routine used
by a server for retrieving documents from another
remote server.

Figure 10 is a block diagram of a prior art server
system showing a relationship between various
services and a database.

Figure 11 is a block diagram of a server system
showing a relationship between various services
and a user database.

Figure 12 is a flow diagram illustrating a routine
used by a server for regulating access to various
services provided by the server.

DETAILED DESCRIPTION 

[0009] A method and apparatus are described for
providing proxying and transcoding of documents in a
network. In the following description, for purposes of
explanation, numerous specific details are set forth in order
to provide a thorough understanding of the present in- 
vention. It will be evident, however, to one skilled in the 
art that the present invention may be practiced without 
these specific details. In other instances, well-known 
structures and devices are shown in block diagram form 
in order to avoid unnecessarily obscuring the present 
invention. 

[0010] The present invention includes various steps, 
which will be described below. The steps can be embod- 
ied in machine-executable instructions, which can be 
used to cause a general-purpose or special-purpose 
processor programmed with the instructions to perform 
the steps. Alternatively, the steps of the present inven- 
tion might be performed by specific hardware compo- 
nents that contain hardwired logic for performing the 
steps, or by any combination of programmed computer 
components and custom hardware components. 

I. System Overview 

[0011] The present invention is included in a system,
known as WebTV™, for providing a user with access to 
the Internet. A user of a WebTV™ client generally ac- 
cesses a WebTV™ server via a direct-dial telephone 
(POTS, for "plain old telephone service"), ISDN (Inte- 
grated Services Digital Network), or other similar con- 
nection, in order to browse the Web, send and receive 
electronic mail (e-mail), and use various other WebTV™ 
network services. The WebTV™ network services are 
provided by WebTV™ servers using software residing 
within the WebTV™ servers in conjunction with software 
residing within a WebTV™ client. 
[0012] Figure 1 illustrates a basic configuration of the
WebTV™ network according to one embodiment. A 
number of WebTV™ clients 1 are coupled to a modem 
pool 2 via direct-dial, bi-directional data connections 29, 
which may be telephone (POTS, i.e., "plain old
telephone service"), ISDN (Integrated Services Digital
Network), or any other similar type of connection. The
modem pool 2 is coupled typically through a router, such
as that conventionally known in the art, to a number of 
remote servers 4 via a conventional network infrastruc- 
ture 3, such as the Internet. The WebTV™ system also 
includes a WebTV™ server 5, which specifically sup- 
ports the WebTV™ clients 1. The WebTV™ clients 1 
each have a connection to the WebTV™ server 5 either 
directly or through the modem pool 2 and the Internet 3. 
Note that the modem pool 2 is a conventional modem 
pool, such as those found today throughout the world 
providing access to the Internet and private networks. 
[0013] Note that in this description, in order to facili- 
tate explanation the WebTV™ server 5 is generally dis- 
cussed as if it were a single device, and functions pro- 
vided by the WebTV™ services are generally discussed 
as being performed by such single device. However, the
WebTV™ server 5 may actually comprise multiple
physical and logical devices connected in a distributed
architecture, and the various functions discussed below
which are provided by the WebTV™ services may
actually be distributed among multiple WebTV™ server
devices.

II. Client System 

[0014] Figure 2 illustrates a WebTV™ client 1. The
WebTV™ client 1 includes an electronics unit 10
(hereinafter referred to as "the WebTV™ box 10"), an
ordinary television set 12, and a remote control 11. In an
alternative embodiment of the present invention, the
WebTV™ box 10 is built into the television set 12 as an
integral unit. The WebTV™ box 10 includes hardware and
software for providing the user with a graphical user
interface, by which the user can access the WebTV™
network services, browse the Web, send e-mail, and
otherwise access the Internet.

[0015] The WebTV™ client 1 uses the television set
12 as a display device. The WebTV™ box 10 is coupled
to the television set 12 by a video link 6. The video link
6 is an RF (radio frequency), S-video, composite video,
or other equivalent form of video link. In the preferred
embodiment, the client 1 includes both a standard
modem and an ISDN modem, such that the communication
link 29 between the WebTV™ box 10 and the server 5
can be either a telephone (POTS) connection 29a or an
ISDN connection 29b. The WebTV™ box 10 receives
power through a power line 7.
[0016] Remote control 11 is operated by the user in
order to control the WebTV™ client 1 in browsing the
Web, sending e-mail, and performing other
Internet-related functions. The WebTV™ box 10 receives
commands from remote control 11 via an infrared (IR)
communication link. In alternative embodiments, the link
between the remote control 11 and the WebTV™ box 10
may be RF or any equivalent mode of transmission.

III. Server System 

[0017] The WebTV™ server 5 generally includes one
or more computer systems generally having the
architecture illustrated in Figure 3. It should be noted that
the illustrated architecture is only exemplary; the present
invention is not constrained to this particular architecture.
The illustrated architecture includes a central processing
unit (CPU) 50, random access memory (RAM) 51,
read-only memory (ROM) 52, a mass storage device 53,
a modem 54, a network interface card (NIC) 55, and
various other input/output (I/O) devices 56. Mass storage
device 53 includes a magnetic, optical, or other equivalent
storage medium. I/O devices 56 may include any or
all of devices such as a display monitor, keyboard,
cursor control device, etc. Modem 54 is used to
communicate data to and from remote servers 4 via the Internet.
[0018] As noted above, the WebTV™ server 5 may
actually comprise multiple physical and logical devices
connected in a distributed architecture. Accordingly,
NIC 55 is used to provide data communication with other
devices that are part of the WebTV™ services. Modem
54 may also be used to communicate with other devices
that are part of the WebTV™ services and which are not
located in close geographic proximity to the illustrated
device.

[0019] According to the present invention, the 
WebTV™ server 5 acts as a proxy in providing the 
WebTV™ client 1 with access to the Web and other 
WebTV™ services. More specifically, WebTV™ server 
5 functions as a "caching proxy". Figure 4A illustrates 
the caching feature of the WebTV™ server 5. In Figure 
4A, the WebTV™ server 5 is functionally located be- 
tween the WebTV™ client 1 and the Internet infrastruc- 
ture 3. The WebTV™ server 5 includes a proxy cache 
65 which is functionally coupled to the WebTV™ client 
1. The proxy cache 65 is used for temporary storage of 
Web documents, images, and other information which 
is used frequently by either the WebTV™ client 1 or the
WebTV™ server 5.

[0020] A document transcoder 66 is functionally cou- 
pled between the proxy cache 65 and the Internet infra- 
structure 3. The document transcoder 66 includes soft- 
ware which is used to automatically revise the code of 
Web documents retrieved from the remote servers 4, for 
purposes which are described below. 
[0021] The WebTV™ service provides a document 
database 61 and a user database 62, as illustrated in 
Figure 4B. The user database 62 contains information 
that is used to control certain features relating to access 
privileges and capabilities of the user of the client 1. This
information is used to regulate initial access to the 
WebTV™ service, as well as to regulate access to the 
individual services provided by the WebTV™ system, 
as will be described below. The document database 61 
is a persistent database which stores certain diagnostic 
and historical information about each document and im- 
age retrieved by the server 5, as is now described. 

A. Document Database 

[0022] The basic purpose of the document database 
61 is that, after a document has once been retrieved by 
the server 5, the stored information can be used by the 
server 5 to speed up processing and downloading of that 
document in response to all future requests for that
document. In addition, the transcoding functions and
various other functions of the WebTV™ service are
facilitated by making use of the information stored in the
document database 61, as will be described below.
[0023] Referring now to Figure 5, the server 5 initially
receives a document request from a client 1 (step 501).
The document request will generally result from the user
of the client 1 activating a hypertext anchor (link) on a
Web page. The act of activating a hypertext anchor may
consist of clicking on underlined text in a displayed Web
page using a mouse, for example. The document
request will typically (but not always) include the URL
(Uniform Resource Locator) or other address of the selected
anchor. Upon receiving the document request, the
server 5 optionally accesses the document database 61 to
retrieve stored information relating to the requested
document (step 502). It should be noted that the document
database 61 is not necessarily accessed in every case.
The information retrieved from the document database
61 is used by the server 5 for determining, among other
things, how long a requested document has been
cached and/or whether the document is still valid. The
criteria for determining validity of the stored document
are discussed below.

[0024] The server 5 retrieves the document from the 
cache 65 if the stored document is valid; otherwise, the 
server 5 retrieves the document from the appropriate re- 
mote server 4 (step 503). The server 5 automatically 
transcodes the document as necessary based on the 
information stored in the document database 61 (step 
503). The transcoding functions are discussed further 
below. 

[0025] The document database 61 includes certain 
historical and diagnostic information for every Web page 
that is accessed at any time by a WebTV™ client 1. As 
is well known, a Web page may correspond to a docu- 
ment written in a language such as HTML (Hypertext 
Mark-Up Language), VRML (Virtual Reality Modelling 
Language), or another suitable language. Alternatively, 
a Web page may represent an image, or a document 
which references one or more images. According to the 
present invention, once a document or image is re- 
trieved by the WebTV™ server 5 from a remote server 
4 for the first time, detailed information on this document 
or image is stored permanently in the document data- 
base 61. More specifically, for every Web page that is 
retrieved from a remote server 4, any or all of the follow- 
ing data are stored in the document database 61: 

1) information identifying bugs (errors) or quirks in 
the Web page, or undesirable effects caused when 
the Web page is displayed by a client 1;

2) relevant bug-finding algorithms; 

3) the date and time the Web page was last re- 
trieved; 

4) the date and time the Web page was most re- 
cently altered by the author; 

5) a checksum for determining whether the Web 
page has been altered; 

6) the size of the Web page (in terms of memory); 

7) the type of Web page (e.g., HTML document, im- 
age, etc.); 

8) a list of hypertext anchors (links) in the Web page 
and corresponding URLs; 

9) a list of the most popular anchors based on the 
number of "hits" (requests from a client 1); 

10) a list of related Web pages which can be
prefetched;

11) whether the Web page has been redirected to
another remote server 4;

12) a redirect address (if appropriate); 

13) whether the redirect (if any) is temporary or
permanent, and if permanent, the duration of the
redirect;

14) if the Web page is an image, the size of the im- 
age in terms of both physical dimensions and mem- 
ory space; 

15) the sizes of in-line images (images displayed in
text) referenced by the document defining the Web 
page; 

16) the size of the largest image referenced by the 
document; 

17) information identifying any image maps in the 
Web page; 

18) whether to resize any images corresponding to 
the Web page; 

19) an indication of any forms or tables in the Web 
page; 

20) any unknown protocols; 

21) any links to "dead" Web pages (i.e., pages 
which are no longer active); 

22) the latency and throughput of the remote server 
4 on which the Web page is located; 

23) the character set of the document; 

24) the vendor of the remote server 4 on which the 
Web page is located; 

25) the geographic location of the remote server 4 
on which the Web page is located; 

26) the number of other Web pages which reference 
the subject Web page; 

27) the compression algorithm used by the image 
or document; 

28) the compression algorithm chosen by the trans- 
coder; 

29) a value indicating the popularity of the Web 
page based on the number of hits by clients; and 

30) a value indicating the popularity of other Web 
pages which reference the subject Web page. 
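
By way of illustration only, a record in the document database 61 might be sketched as follows. The class name DocumentRecord, its field names, the use of SHA-256 as the checksum, and the in-memory dictionary standing in for the persistent database are all assumptions made for this sketch and are not taken from the described embodiment. Only a subset of the thirty items listed above is shown, and the date of alteration is approximated by the time at which a changed checksum is first observed.

```python
import hashlib
from dataclasses import dataclass, field
from datetime import datetime
from typing import Dict, List, Optional, Tuple


@dataclass
class DocumentRecord:
    """Hypothetical per-page record; field names are illustrative only."""
    url: str
    known_bugs: List[str] = field(default_factory=list)      # item 1
    last_retrieved: Optional[datetime] = None                 # item 3
    last_modified: Optional[datetime] = None                  # item 4 (approximated)
    checksum: Optional[str] = None                            # item 5
    size_bytes: int = 0                                       # item 6
    page_type: str = "text/html"                              # item 7
    inline_image_sizes: Dict[str, Tuple[int, int]] = field(default_factory=dict)  # item 15
    hits: int = 0                                             # item 29


def record_retrieval(db: Dict[str, DocumentRecord], url: str,
                     body: bytes, now: datetime) -> DocumentRecord:
    """Create or update the record for a page just fetched from a remote server."""
    rec = db.setdefault(url, DocumentRecord(url=url))
    digest = hashlib.sha256(body).hexdigest()
    if rec.checksum != digest:
        # The checksum differs, so the page is treated as altered since last seen.
        rec.last_modified = now
    rec.checksum = digest
    rec.size_bytes = len(body)
    rec.last_retrieved = now
    rec.hits += 1
    return rec
```

Keeping the checksum alongside the retrieval time lets the server distinguish "fetched again" from "actually changed", which is the distinction the validity criteria discussed later rely on.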

B. Transcoding 

[0026] As mentioned above, the WebTV™ services 
provide a transcoder 66, which is used to rewrite certain 
portions of the code in an HTML document for various 
purposes. These purposes include: (1) correcting bugs 
in documents; (2) correcting undesirable effects which 
occur when a document is displayed by the client 1; (3) 
improving the efficiency of transmission of documents 
from the server 5 to the client 1; (4) matching hardware
decompression technology within the client 1; (5) resiz- 
ing images to fit on the television set 12; (6) converting 
documents into other formats to provide compatibility; 
(7) reducing latency experienced by a client 1 when dis- 
playing a Web page with in-line images (images dis- 
played in text); and, (8) altering documents to fit into 
smaller memory spaces. 



[0027] There are three transcoding modes used by
the transcoder 66: (1) streaming, (2) buffered, and (3)
deferred. Streaming transcoding refers to the transcoding
of documents on a line-by-line basis as they are
retrieved from a remote server 4 and downloaded to the
client 1 (i.e., transcoding "on the fly"). Some documents,
however, must first be buffered in the WebTV™ server
5 before transcoding and downloading them to the client
1. A document may need to be buffered before
transmitting it to the client 1 if the type of changes to be made
can only be made after the entire document has been
retrieved from the remote server 4. Because buffering a
document before downloading it to the client 1
increases latency and decreases throughput, it is not
desirable to buffer all documents. Therefore, the
transcoder 66 accesses and uses information in the
document database 61 relating to the requested document
to first determine whether a requested document must
be buffered for purposes of transcoding, before the
document is retrieved from the remote server 4.

[0028] In the deferred mode, transcoding is deferred
until after a requested document has been downloaded
to a client 1. The deferred mode therefore reduces
latency experienced by the client 1 in receiving the
document. Transcoding may be performed immediately after
downloading or any time thereafter. For example, it may
be convenient to perform transcoding during periods of
low usage of WebTV™ services, such as at night. This
mode is useful for certain types of transcoding which are
not mandatory.
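
The choice among the three modes can be sketched as a lookup against diagnostics recorded on an earlier retrieval. This is a minimal sketch under stated assumptions: the fix identifiers, the two sets classifying them, and the decision to stream a document that has no recorded diagnostics are all hypothetical and are not specified in the description.

```python
from enum import Enum
from typing import Optional, Set


class Mode(Enum):
    STREAMING = "streaming"
    BUFFERED = "buffered"
    DEFERRED = "deferred"


# Hypothetical fix identifiers; which fixes need the whole document is an assumption.
WHOLE_DOCUMENT_FIXES: Set[str] = {"unbalanced_table"}
OPTIONAL_FIXES: Set[str] = {"recompress_images"}


def choose_mode(known_fixes: Optional[Set[str]]) -> Mode:
    """Pick a transcoding mode before the document is fetched again.

    known_fixes is None when no diagnostics are stored for the document.
    """
    if not known_fixes:
        # Nothing recorded, or nothing to fix: transcode line by line.
        return Mode.STREAMING
    if known_fixes & WHOLE_DOCUMENT_FIXES:
        # At least one change can only be made once the entire document is in hand.
        return Mode.BUFFERED
    if known_fixes <= OPTIONAL_FIXES:
        # Only non-mandatory work remains, so it can wait until after download.
        return Mode.DEFERRED
    return Mode.STREAMING
```

Because the decision uses only stored diagnostics, it can be made before the remote server 4 is contacted, so buffering is paid for only by the documents that need it.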

1. Transcoding for Bugs and Quirks

[0029] One characteristic of some prior art Web
browsers is that they may experience failures ("crashes")
because of bugs or unexpected features ("quirks")
that are present in a Web document. Alternatively, quirks
in a document may cause an undesirable result, even
though the client does not crash. Therefore, the
transcoding feature of the present invention provides a
means for correcting certain bugs and quirks in a Web
document. To be corrected by the transcoder 66, bugs
and quirks must be identifiable by software running on
the server 5. Consequently, the transcoder 66 will
generally only correct conditions which have been
previously discovered, such as those discovered during testing
or reported by users. Once a bug or quirk is discovered,
however, algorithms are added to the transcoder 66
both to detect the bug or quirk in the future in any Web
document and to automatically correct it.

[0030] There are countless possibilities of bugs or
quirks which might be encountered in a Web document.
Therefore, no attempt will be made herein to provide an
exhaustive list. Nonetheless, some examples may be
useful at this point. Consider, for example, an HTML
document that is downloaded from a remote server 4
and which contains a table having a width specified in
the document as "0." This condition might cause a
failure if the client were to attempt to display the document
as written. This situation, therefore, can be detected and
corrected by the transcoder 66. Another example is a
quirk in the document which causes quotations to be
terminated with too many quotation marks. Once the quirk
is first detected and an algorithm is written to recognize
it, the transcoder 66 can automatically correct the quirk
in any document.
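
The two examples above can be sketched as pattern-based rules applied line by line, as in the streaming mode. The rule names, the regular expressions, and the choice to drop a zero width attribute rather than substitute another value are assumptions made for illustration; the description does not state how the transcoder 66 detects or repairs these conditions.

```python
import re
from typing import List, Tuple

# Each rule is (hypothetical name, detection pattern, replacement).
RULES = [
    # A table whose width is given as 0: drop the attribute so the client
    # lays the table out with its default width instead.
    ("zero_width_table",
     re.compile(r'(<table\b[^>]*?)\s+width\s*=\s*"?0"?(?=[\s>])', re.IGNORECASE),
     r"\1"),
    # A quoted attribute value terminated by two or more quotation marks:
    # keep exactly one closing quotation mark.
    ("extra_closing_quotes",
     re.compile(r'(="[^"<>]+)"{2,}'),
     r'\1"'),
]


def transcode_line(line: str) -> Tuple[str, List[str]]:
    """Apply every rule to one line; return the fixed line and the rules that fired."""
    applied: List[str] = []
    for name, pattern, replacement in RULES:
        line, count = pattern.subn(replacement, line)
        if count:
            applied.append(name)
    return line, applied
```

The list of rule names returned for each line is the kind of diagnostic information that could be stored in the document database 61, so that a later request knows in advance which corrections the document needs.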

[0031] If a given Web document has previously been 
retrieved by the server 5, there will be information re- 
garding that document available in the document data- 
base 61 as described above. The information regarding 
this document will include whether or not the document 
included any bugs or quirks that required transcoding 
when the document was previously retrieved. The trans- 
coder 66 utilizes this information to determine whether 
(1) the document is free of bugs and quirks, (2) the doc- 
ument has bugs or quirks which can be remedied by 
transcoding on the fly, or (3) the document has bugs or 
quirks which cannot be corrected on the fly (i.e., buffer- 
ing is required). 

[0032] Figure 6 illustrates a routine for transcoding a 
Web document for purposes of eliminating bugs and 
quirks. Initially, the server 5 receives a document re- 
quest from the client 1 (step 601). Next, the document 
database 61 is accessed to determine whether or not 
the requested document has been previously retrieved 
(step 602). If the document has not been previously re- 
trieved, then the server 5 retrieves the document from 
the remote server 4 (step 609). Next, the retrieved doc- 
ument is analyzed for the presence of bugs or unusual 
conditions (step 610). Various diagnostic information is 
then stored in the document database 61 as a result of 
the analysis to note any bugs or quirks that were found 
(step 611). If any bugs or quirks were found which can 
be corrected by the transcoder 66, the document is then 
transcoded and saved to the proxy cache 65 (step 612). 
The transcoded document is then downloaded to the
client 1 (step 613). It should be noted that transcoding can
be deferred until after the document has been down- 
loaded, as described above; hence, the sequence of 
Figure 6 is illustrative only. 

[0033] If (in step 602) the requested document had 
been previously retrieved, then it is determined whether 
the requested document is still valid (step 603) and 
whether the document is present in the proxy cache 65 
(step 604). If the document is no longer valid, then the 
document is retrieved from the remote server 4, ana- 
lyzed for bugs and quirks, transcoded as required, and 
then downloaded to the client 1 as described above 
(steps 610-613, step 607). Methods for determining va- 
lidity of a document are discussed below. If the docu- 
ment is still valid (step 603) and the document is present 
in the cache 65, the document is downloaded to the cli- 
ent 1 in its current form (as it is stored in the cache), 
since it has already been transcoded (step 608). 
[0034] The document, however, may be valid but not
present in the cache. This may be the case, for example,
if the document has not been requested recently and
the cache 65 has become too full to retain the requested
document. In that case, the document is retrieved again
from the remote server 4 (step 605) and then transcoded
on the basis of the previously-acquired diagnostic
information stored within the database 61 for that document.
The document is then saved to the cache 65 (step 606).
Note that because the document is still valid, it is
assumed that the diagnostic information stored in the
document database 61 for that document is still valid and
that the transcoding can be performed on the basis of
that information. Accordingly, once the document is
transcoded, the transcoded document is downloaded to
the client 1 (step 607). Again, note that transcoding can
be deferred until after the document has been
downloaded in some cases.
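
The routine of Figure 6 can be summarized in the following sketch. The function name serve_document, the use of dictionaries for the document database 61 and the proxy cache 65, and the injected fetch, analyze, transcode and is_valid callables are all hypothetical stand-ins; the step numbers in the comments follow the description above, and deferred transcoding is not modeled.

```python
from typing import Callable, Dict, List, Optional


def serve_document(url: str,
                   database: Dict[str, List[str]],
                   cache: Dict[str, str],
                   fetch: Callable[[str], str],
                   analyze: Callable[[str], List[str]],
                   transcode: Callable[[str, List[str]], str],
                   is_valid: Callable[[str], bool]) -> str:
    """Return the document to be downloaded to the client."""
    diagnostics: Optional[List[str]] = database.get(url)         # step 602
    if diagnostics is not None and is_valid(url):                 # step 603
        if url in cache:                                          # step 604
            return cache[url]                                     # step 608
        # Valid but no longer cached: refetch and reuse the stored diagnostics.
        document = transcode(fetch(url), diagnostics)             # step 605
        cache[url] = document                                     # step 606
        return document                                           # step 607
    # Never retrieved, or no longer valid: fetch and analyze afresh.
    raw = fetch(url)                                              # step 609
    diagnostics = analyze(raw)                                    # step 610
    database[url] = diagnostics                                   # step 611
    document = transcode(raw, diagnostics)
    cache[url] = document                                         # step 612
    return document                                               # step 613
```

Note that the middle branch skips the analysis step entirely, which is the saving the persistent database is meant to provide when a valid document has merely been evicted from the cache.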

[0035] The validity of the requested document can be
determined based on various different criteria. For
example, some HTML documents specify a date on which
the document was created, a length of time for which
the document will be valid, or both. The validity
determination can be based upon such information. For
example, a document which specifies only the date of
creation can be automatically deemed invalid after a
predetermined period of time has passed.

[0036] Alternatively, validity can be based upon the
popularity of the requested document. "Popularity" can
be quantified based upon the number of hits for that
document, which is tracked in the document database 61.
For example, it might be prudent to simply assign a
relatively short period of validity to a document which is
very popular and a longer period of validity to a
document which is less popular.

[0037] Another alternative basis for the validity of a
document is the observed rate of change of the
document. Again, data in the persistent document database
61 can be used. That is, because the document
database 61 stores the date and time on which the document
was last observed to change, the server 5 can
approximate how often the document actually changes. A
document or image which is observed to change frequently
(e.g., a weather map or a news page) can be assigned
a relatively short period of validity. It will be recognized
that numerous other ways of determining validity are
possible.
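
The three criteria above can be sketched as a single function that returns a period of validity. The description presents these criteria as alternatives; combining them by taking the shortest applicable period, as well as every threshold value and name below, is an assumption made purely for illustration.

```python
from datetime import datetime, timedelta
from typing import Optional

# All thresholds are illustrative assumptions, not values from the description.
DEFAULT_LIFETIME = timedelta(days=7)
POPULAR_HITS = 1000
POPULAR_LIFETIME = timedelta(hours=1)


def validity_period(stated_lifetime: Optional[timedelta],
                    hits: int,
                    observed_change_interval: Optional[timedelta]) -> timedelta:
    """Choose how long a cached copy is treated as valid."""
    if stated_lifetime is not None:
        # The document itself specifies how long it remains valid.
        return stated_lifetime
    period = DEFAULT_LIFETIME
    if hits >= POPULAR_HITS:
        # Very popular documents get a relatively short period of validity.
        period = min(period, POPULAR_LIFETIME)
    if observed_change_interval is not None:
        # Documents observed to change often are refreshed at least that often.
        period = min(period, observed_change_interval)
    return period


def is_valid(retrieved_at: datetime, now: datetime, period: timedelta) -> bool:
    """A cached copy is valid while its age is less than its period of validity."""
    return now - retrieved_at < period
```

For example, under these assumed thresholds a weather map observed to change every 30 minutes would be refetched after 30 minutes even if it had received few hits.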

2. Transcoding to Reduce Latency 

[0038] Another purpose for transcoding, which does
not form part of the invention as claimed, is to allow
documents requested by a client 1 to be displayed by the
client 1 more rapidly. Many HTML documents contain
references to "in-line" images, or images that will be
displayed in text in a Web page. The normal process used
in the prior art to display a Web page having in-line
images is that the HTML document referencing the image
is first downloaded to the client, followed by the client's
requesting the referenced image. The referenced image
is then retrieved from the remote server on which it is
located and downloaded to the client. The speed with
which a complete Web page can be displayed to the
user is often limited by the time it takes to retrieve in-line
images. One reason for this is that it simply takes time
to retrieve the image itself after the referencing
document has been retrieved. Another reason is that, if the
referencing document does not specify the size of the
image, the Web page generally cannot be displayed
until the image itself has been retrieved.
[0039] It is known that information stored in the doc- 
ument database 61 regarding the in-line images is used 
to transcode the referencing document in order to re- 
duce latency in displaying the Web page. Once any doc- 
ument which references an in-line image is initially re- 
trieved by the server 5, the fact that the document ref- 
erences an in-line image is stored in the document da- 
tabase 61. In addition, the size of the image is deter- 
mined, either from the document (if specified) or from 
the image itself, and then stored in the document data- 
base 61. Consequently, for documents which do not 
specify the size of their in-line images, the size informa- 
tion stored in the database 61 is then used the next time 
the document is requested in order to reduce latency in 
downloading and displaying the Web page. 
[0040] Refer now to Figure 7, which illustrates a
routine, known from the prior art, for reducing latency when
downloading a document referencing an image to a
client 1. Assume that a client 1 sends a request to the
server 5 for an HTML document containing a reference to
an in-line image. Assume further that the size of the
image is not specified in the document itself. Initially, the
server 5 determines whether that document has been
previously retrieved (step 701). If not, the standard initial
retrieval and transcoding procedure is followed (step
706), as described in connection with Figure 6. If,
however, the document has been previously retrieved, then
the transcoder 66 accesses the size information stored
in the document database 61 for the in-line image (step
702). Based on this size information, the HTML
document is transcoded such that, when the Web page is
initially displayed by the client 1, the area in which the
image belongs is replaced by a blank region enveloping
the shape of the image. Thus, any in-line image
referenced by a document is displayed initially as a blank
region. Consequently, the client 1 can immediately
display the Web page corresponding to the HTML
document even before the referenced image has been
retrieved or downloaded (i.e., even before the size of the
image is known to the client 1).
[0041] As the transcoded HTML document is down- 
loaded to the client, the image is retrieved from the ap- 
propriate remote server 4 (step 704). Once the image is 
retrieved from the remote server 4 and downloaded to 
the client 1, the client 1 replaces the blank area in the
Web page with the actual image (step 705). 
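
One way such a transcoding step could be realized is to write the stored dimensions into each image tag that lacks them, so that the client can reserve a blank region of the right shape before the image arrives. This is a sketch under stated assumptions: the use of width and height attributes, the function name add_image_sizes, and the mapping from image address to (width, height) standing in for the document database 61 are not taken from the description.

```python
import re
from typing import Dict, Tuple

IMG_TAG = re.compile(r"<img\b[^>]*>", re.IGNORECASE)
SRC_ATTR = re.compile(r'\bsrc\s*=\s*"([^"]*)"', re.IGNORECASE)
HAS_SIZE = re.compile(r"\b(width|height)\s*=", re.IGNORECASE)


def add_image_sizes(html: str, sizes: Dict[str, Tuple[int, int]]) -> str:
    """Insert stored dimensions into image tags that do not specify a size."""

    def fix(match: "re.Match[str]") -> str:
        tag = match.group(0)
        src = SRC_ATTR.search(tag)
        # Leave the tag alone if it already carries a size or the image is unknown.
        if HAS_SIZE.search(tag) or src is None or src.group(1) not in sizes:
            return tag
        width, height = sizes[src.group(1)]
        closing = "/>" if tag.endswith("/>") else ">"
        head = tag[:-len(closing)].rstrip()
        return f'{head} width="{width}" height="{height}"{closing}'

    return IMG_TAG.sub(fix, html)
```

Tags that already state a size are left untouched, which matches the case described above in which only documents that do not specify the size of their in-line images benefit from the stored information.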



3. Transcoding to Display Web Pages on a Television 

[0042] As noted above, the client 1 utilizes an ordinary
television set 12 as a display device. However, images in
Web pages are generally formatted for display on a computer
monitor, not a television set. Consequently, a known
transcoding function is used to resize images for display on
the television set 12. This includes rescaling images as
necessary to avoid truncation when displayed on the
television set 12.
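
As a rough illustration of rescaling to avoid truncation, the
following Python sketch shrinks an image to fit a fixed visible
area while preserving its aspect ratio. The function name
`fit_to_screen` and the default 544 x 372 pixel area are
assumptions chosen for the example; the text does not specify
the dimensions of the visible region.

```python
def fit_to_screen(width, height, max_width=544, max_height=372):
    """Return (width, height) scaled down, if necessary, to fit the
    visible area. The default area is an illustrative assumption."""
    if width <= max_width and height <= max_height:
        return width, height  # already fits; never enlarge
    # Integer arithmetic avoids floating-point rounding surprises.
    if width * max_height >= height * max_width:
        # Width is the limiting dimension.
        return max_width, max(1, height * max_width // width)
    # Height is the limiting dimension.
    return max(1, width * max_height // height), max_height
```

For instance, under these assumed limits a 640 x 480 image
would be reduced to 496 x 372.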

[0043] It should be noted that prior art Web browsers which
operate on computer monitors typically use resizable
windows. Hence, the size of the visible region varies from
client to client. However, the Web browser used by the
WebTV™ client 1 is specifically designed for display on a
television set.

4. Transcoding for Transmission Efficiency 

[0044] Documents retrieved by the server 5 are also
transcoded, using methods known from the prior art, to
improve transmission efficiency. In particular, documents
can be transcoded to reduce high-frequency components and
thereby reduce interlace flicker when they are displayed on
a television set.

[0045] Documents can also be transcoded in order to lower
the resolution of the displayed Web page. Reducing the
resolution is desirable because images formatted for
computer systems will generally have a higher resolution
than the NTSC (National Television System Committee) video
format used by conventional television sets. Since NTSC
video does not have the bandwidth to reproduce the
resolution of computer-formatted images, the bandwidth
consumed in transmitting images to the client 1 at such a
high resolution would be wasted.

5. Other Uses for Transcoding 

[0046] In the prior art, transcoding may also be used to
recode a document from a new format into an older,
compatible format. Images are often displayed in the JPEG
(Joint Photographic Experts Group) format or the GIF image
format. JPEG often consumes less bandwidth than GIF,
however. Consequently, images which are retrieved in GIF
format are sometimes transcoded into JPEG format. Methods
for converting images between GIF and JPEG formats are well
known.
[0047] Other uses for transcoding include transcoding audio
files. For example, audio may be transcoded into different
formats in order to achieve a desired balance between memory
usage, sound quality, and data transfer rate. In addition,
audio may be transcoded from a file format (e.g., an ".AU"
file) to a streaming format (e.g., MPEG 1 audio). Yet
another use of audio transcoding is the transcoding of MIDI
(Musical Instrument Digital Interface) data to streaming
variants of MIDI.
[0048] Additionally, documents or images requiring a large
amount of memory (e.g., long lists) can be transcoded in
order to consume less memory space in the client 1. This may
involve, for example, separating a large document or image
into multiple sections. For example, a server can insert
tags at appropriate locations in the original document so
that the document appears to a client as multiple Web pages.
Hence, while viewing a given page representing a portion of
the original document, the user can view the next page
(i.e., the next portion of the original document) by
activating a button on the screen as if it were an ordinary
hypertext anchor.
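
The splitting of a long list into multiple pages might look
like the following Python sketch. The function name
`paginate`, the `?page=N` link scheme and the fixed page size
are hypothetical; the text states only that tags are inserted
so that the document appears as multiple Web pages.

```python
def paginate(items, page_size, base_url):
    """Split a long list of text items into HTML pages, each ending
    with a 'Next' anchor except the last."""
    chunks = [items[i:i + page_size]
              for i in range(0, len(items), page_size)]
    pages = []
    for number, chunk in enumerate(chunks, start=1):
        body = "".join(f"<li>{item}</li>" for item in chunk)
        html = f"<ul>{body}</ul>"
        if number < len(chunks):
            # The link to the next portion behaves like an ordinary
            # hypertext anchor from the client's point of view.
            html += f'<a href="{base_url}?page={number + 1}">Next</a>'
        pages.append(html)
    return pages
```

Because each page holds at most `page_size` items, the client
needs memory for only one portion of the original document at
a time.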

C. Proxying 

[0049] As noted above, the server 5 functions as a 
proxy on behalf of the client 1 for purposes of accessing 
the Web. The document database 61 is used in various 
ways to facilitate this proxy role, as will now be de- 
scribed. 

1. Updating Cached Documents

[0050] It is desirable to store frequently-requested HTML
documents and images in the proxy cache 65 to further reduce
latency in providing Web pages to the client 1. However,
because some documents and images change over time,
documents in the cache 65 will not be valid indefinitely, as
mentioned above. A weather map or a news-related Web page,
for example, is likely to be updated quite frequently.
Consequently, it is desirable for the server 5 to have the
ability to estimate the frequency with which documents
change, in order to determine how long a document can safely
remain within the proxy cache 65 without being updated.
[0051] The persistent document database 61 is used to store
the date and time of the last several fetches of each
document and image retrieved from a remote server 4, along
with an indication of any changes that were detected. A
document or image which has been stored in the cache 65 is
then retrieved on a periodic basis to determine if it has
been changed. Change status information indicating whether
the document has changed since the previous fetch is then
stored in the document database 61. If no changes are
detected, then the time interval between fetches of this
document is increased. If the document has changed, the time
interval is maintained or decreased. As a result, items in
the cache 65 which change frequently will be automatically
updated at frequent intervals, whereas documents which do
not change often will be replaced in the cache less
frequently.

[0052] Figure 8 illustrates a routine for updating documents
stored in the proxy cache 65 using data stored in the
document database 61. Assume a document X has been stored in
the proxy cache 65. Document X remains in the cache 65 until
a predetermined update period T1 expires (step 801). Upon
the expiration of the update period T1, the document X is
again retrieved from the appropriate remote server 4
(step 802). The newly-retrieved document X is then compared
to the cached version of document X (step 803). If the
document has changed, then the cached version of document X
is replaced with the newly-retrieved version of document X
(step 806). If not, then the update period T1 is increased
according to a predetermined time increment Δt1 (step 804).
In any case, the date and time and the change status of
document X are saved to the document database 61 (step 805).
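
The routine of Figure 8 can be sketched in Python as follows.
This is an illustrative reading of steps 803 to 806 only: the
`CacheEntry` class, the `refresh` function, the one-hour
default increment and the in-memory `history` list (standing
in for the records kept in the document database 61) are
assumptions, and the timer of step 801 and the network fetch
of step 802 are left to the caller.

```python
from dataclasses import dataclass, field


@dataclass
class CacheEntry:
    content: str
    update_period: float                          # T1, in seconds
    history: list = field(default_factory=list)   # (timestamp, changed)


def refresh(entry, fetched, now, increment=3600.0):
    """Apply steps 803-806 to a freshly fetched copy of a document."""
    changed = fetched != entry.content        # step 803: compare
    if changed:
        entry.content = fetched               # step 806: replace
    else:
        entry.update_period += increment      # step 804: T1 += delta t1
    entry.history.append((now, changed))      # step 805: record status
    return changed
```

A document that never changes therefore sees its update period
grow by one increment per check, while a changed document keeps
its current period.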

2. Document and Image Prefetching

[0053] It is known that a document database may also be used
by a server to store prefetching information relating to
documents and images. In particular, the database may store,
for each document that has been retrieved, a list of images
referenced by the document, if any, and their locations.
Consequently, the next time a document is requested by a
client, the images can be immediately retrieved by a server
(from a cache, if available, or from a remote server), even
before the client requests them. This procedure improves the
speed with which requested Web pages are downloaded to the
client.
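
A minimal Python sketch of such prefetching information is
given below. The `image_refs` table, the function names and
the regular expression used to find `<img>` references are
hypothetical; they simply show one way of recording, per
document, the images to fetch on the next request.

```python
import re

# Hypothetical prefetch table: document URL -> list of image URLs.
image_refs = {}

IMG_SRC = re.compile(r'<img\b[^>]*\bsrc="([^"]+)"', re.IGNORECASE)


def record_images(doc_url, html):
    """Remember which images a retrieved document references."""
    image_refs[doc_url] = IMG_SRC.findall(html)


def images_to_prefetch(doc_url):
    """Images the server can start fetching as soon as the document
    is requested again, before the client asks for them."""
    return image_refs.get(doc_url, [])
```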

[0054] A document database may also be used to facilitate a
process referred to as "server-advised client prefetching."
Server-advised client prefetching allows the server to
inform the client of documents or images which are popular
to allow the client to perform the prefetching. In
particular, for any given document, a list is maintained in
the server of the most popular hypertext anchors in that
document (i.e., those which have previously received a large
number of hits). When that document is requested by the
client, the server provides the client with an indication of
these popular links.
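
The per-document list of popular anchors could be kept with a
simple hit counter, as in the Python sketch below. The names
`hits`, `record_hit` and `popular_links`, and the default
limit of three links, are assumptions for illustration.

```python
from collections import Counter, defaultdict

# Hypothetical hit table: document URL -> Counter of followed links.
hits = defaultdict(Counter)


def record_hit(doc_url, link_url):
    """Count one selection of a hypertext anchor within a document."""
    hits[doc_url][link_url] += 1


def popular_links(doc_url, limit=3):
    """Return the most frequently followed links of a document, most
    popular first, for the server to advise the client about."""
    return [url for url, _ in hits[doc_url].most_common(limit)]
```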

3. Redirects 

[0055] Web pages are sometimes forwarded from the remote
server on which they are initially placed to a different
location. Under HTTP (Hypertext Transfer Protocol), such
forwarding is sometimes referred to as a "redirect." When an
HTML document is initially stored on one remote server and
then later transferred to another remote server, the first
remote server will provide, in response to a request for
that document, an indication that the document has been
transferred to a new remote server. This indication
generally includes a forwarding address ("redirect
address"), which is generally a URL.
[0056] In the prior art, when a computer requesting a Web
page receives a redirect, it must then submit a new request
to the redirect address. Having to submit a second request
and wait for a second response consumes time and increases
overall latency. Consequently, the present invention uses
the document database 61 to store any redirect address for
each document or image. Any time a redirected document is
requested, the server 5 automatically accesses the redirect
address to retrieve the document. The document or image is
provided to the client 1 based on only a single request from
the client 1. The change in location of the redirected
document or image remains completely transparent to the
client 1.

[0057] Figure 9 illustrates a routine performed by the
server 5 in accessing documents which may have been
forwarded to a new remote server. Initially, the server 5
receives a request for a document, which generally includes
an address (step 901). The server 5 then accesses the
document database 61 to determine whether there is a
redirect address for the requested document (step 902). If
there is no redirect address, then the server 5 accesses a
remote server 4 based on the address provided in the
document request from the client 1 (step 903). Assuming that
the remote server 4 does not respond to the server 5 with a
redirect (step 904), the document is retrieved and
downloaded to the client 1 by the server 5 (step 907). If,
however, a redirect address was stored in the document
database 61 (step 902), then the server 5 accesses the
requested document according to the redirect address
(step 906). Or, if the remote server 4 responded with a
redirect (step 904), then the server 5 saves the redirect
address to the document database 61 (step 905) and accesses
the requested document according to the redirect address
(step 906).
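
The routine of Figure 9 can be sketched in Python as follows.
The sketch makes several assumptions: `redirect_db` is a plain
dictionary standing in for the redirect addresses held in the
document database 61, and `fetch` is a hypothetical callable
returning either `("redirect", new_url)` or
`("ok", document)`. Only a single level of redirection is
handled, as in the figure.

```python
def fetch_via_proxy(url, redirect_db, fetch):
    """Retrieve a document for the client with one client request,
    following and remembering a redirect if there is one."""
    # Step 902: use a stored redirect address if one exists (step 906);
    # otherwise use the address from the client's request (step 903).
    target = redirect_db.get(url, url)
    status, payload = fetch(target)
    if status == "redirect":                  # step 904
        redirect_db[url] = payload            # step 905: save address
        status, payload = fetch(payload)      # step 906: follow it
    return payload                            # step 907: to the client
```

On the first request for a moved document the server contacts
two remote servers; on later requests the stored address lets
it go straight to the new location.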

4. Other Proxy Functions 

[0058] The document database 61 also stores information
relating to the performance of each remote server 4 from
which a document is retrieved. This information includes the
latency and throughput of the remote server 4. Such
information can be valuable in instances where a remote
server 4 has a history of responding slowly. For example,
when the document is requested, this knowledge can be used
by the server 5 to provide a predefined signal to the
client 1. The client 1 can, in response to the signal,
indicate to the user that a delay is likely and give the
user the option of canceling the request.
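
One way the stored performance information might drive such a
signal is sketched below in Python. The `latency_log` table,
the function names and the five-second threshold on mean
latency are assumptions; the text does not say how a "history
of responding slowly" is determined.

```python
from statistics import mean

# Hypothetical performance table: host -> observed latencies (seconds).
latency_log = {}


def record_latency(host, seconds):
    """Store one latency observation for a remote server."""
    latency_log.setdefault(host, []).append(seconds)


def likely_slow(host, threshold=5.0):
    """True if past requests to this host were slow on average, in
    which case the server could signal the client to warn the user."""
    samples = latency_log.get(host)
    return bool(samples) and mean(samples) > threshold
```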

5. Backoff Mode 

[0059] Although the server 5 generally operates in the proxy
mode, it is known that it can also enter a "backoff mode" in
which the server 5 does not act as a proxy, or the server 5
performs only certain aspects of the normal proxying
functions. For example, if the proxy cache 65 is overloaded,
then the server 5 can enter a backoff mode in which
documents are not cached but are transcoded as required.
Alternatively, during times when the server 5 is severely
overloaded with network traffic, the server 5 may instruct
the client 1 to bypass the server 5 and contact remote
servers 4 directly for a specified time or until further
notice. Or, the server 5 can enter a flexible backoff mode
in which the client 1 will be instructed to contact a remote
server 4 directly only for certain Web sites for a limited
period of time.
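
The choice between these modes can be pictured with a small
Python sketch. The `Mode` names, the load figures expressed as
fractions of capacity and the 0.9 limit are all hypothetical,
and the per-site "flexible" backoff mode is omitted for
brevity.

```python
from enum import Enum


class Mode(Enum):
    PROXY = "proxy"        # normal operation: cache and transcode
    NO_CACHE = "no-cache"  # cache overloaded: transcode only
    BYPASS = "bypass"      # server overloaded: client goes direct


def choose_mode(cache_load, traffic_load, limit=0.9):
    """Pick an operating mode from two load figures in the range 0..1.
    The limit is an illustrative assumption."""
    if traffic_load > limit:
        return Mode.BYPASS
    if cache_load > limit:
        return Mode.NO_CACHE
    return Mode.PROXY
```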

D. Access to WebTV™ Services 

[0060] The WebTV™ server 5 provides various services to the
client 1, such as proxying and electronic mail ("e-mail").
In the prior art, certain difficulties are associated with
allowing a client computer access to different services of
an Internet service, as will now be explained with reference
to Figure 10.

[0061] Figure 10 illustrates a client-server system
according to one prior art embodiment. The server 76
provides various services A, B, and C. The server 76
includes a database 71 for storing information on the user's
access privileges to services A, B, and C. The client 75 of
the embodiment of Figure 10 accesses any of services A, B,
and C by contacting that service directly. The contacted
service then accesses the database 71, which stores the
access privileges of the client 75, to determine whether the
client 75 should be allowed to access that service. Hence,
each service provided by the server 76 requires direct
access to the database 71. This architecture results in a
large number of accesses being made to the database 71,
which is undesirable. In addition, the fact that each
service independently has access to the database 71 raises
security concerns. Specifically, it can be difficult to
isolate sensitive user information. The following example,
not forming part of the invention, overcomes such
difficulties using a technique which is now described.

1. Tickets Containing Privileges And Capabilities 

[0062] As shown by way of example in Figure 11, the server 5
provides a number of services D, E, and F, and a log-in
service 78. The log-in service 78 is used specifically to
control initial log-on procedures by a client 1. The log-in
service 78 has exclusive access to the user database 62
(discussed above with respect to Figure 4B). The log-in
service 78 and the user database 62 are located within a
first security zone 84. Service D is located within a second
security zone 86, while services E and F are contained
within a third security zone 88. Note that the specific
arrangement of security zones 84, 86, and 88 with respect to
services D, E, and F is illustrative only.
[0063] The user database 62 stores various information
pertaining to each authorized user of a client 1. This
information includes account information, a list of the
WebTV™ services that are available to the particular user,
and certain user preferences. For example, a particular user
may not wish his client 1 to be used to access Web pages
having adult-oriented subject matter. Consequently, the user
would request that his account be filtered to prevent access
to such material. This request would then be stored as part
of the user data in the user database 62.

[0064] With regard to user preferences, the hypertext links
selected by a given user can be tracked, and those having
the largest number of hits can be stored in the user
database 62. The list can then be provided to the client 1
for use in generating a menu screen of the user's favorite
Web sites, to allow the user to directly access those Web
sites. The list can also be used by the server 5 to analyze
the user's interests and to formulate and provide to the
user a list of new Web sites which the user is likely to be
interested in. The list might be composed by associating key
words in Web pages selected by the user with other Web
pages.

[0065] Referring again to the example shown in Figure 11,
in response to a log-on request by a client 1, the
log-in service 78 consults the user database 62 to de- 
termine if access to the server 5 by this particular client 
1 is authorized. Assuming access is authorized, the log- 
in service 78 retrieves certain user information pertain- 
ing to this particular client 1 from the user database 62. 
The log-in service then generates a "ticket" 82, which is 
an information packet including the retrieved informa- 
tion. The ticket 82 is then provided to the client 1 which 
requested access. 

[0066] The ticket 82 includes all information neces- 
sary to describe the access privileges of a particular us- 
er with respect to all services provided by the server 5. 
For example, the ticket may include the user name registered
to the client 1, the e-mail address assigned to
client 1, and any filtering requested by the user with re- 
spect to viewing Web sites. Each time the user requests 
access to one of the services D, E, or F, the client 1 sub- 
mits a copy of the ticket 82 to that service. The requested 
service can then determine from the copy of the ticket 
82 whether access to that service by that client 1 is au- 
thorized and, if so, any important information relating to 
such access. 

[0067] None of the services provided by the server 5, 
other than the log-in service 78, has access to the user 
database 62. Hence, any security-sensitive information 
can be isolated within the user database 62 and the log- 
in service 78. Such isolation allows the individual serv- 
ices provided by the server 5 to be placed within sepa- 
rate "firewalls" (security regions), illustrated as security 
zones 84, 86, and 88. In addition, this technique greatly 
reduces the number of accesses required to the user 
database 62 compared to the prior art embodiment il- 
lustrated in Figure 10. 
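
The ticket mechanism can be illustrated with the following
Python sketch. The `Ticket` fields mirror the examples given
above (user name, e-mail address, filtering), while the
`services` field, the `USER_DB` dictionary standing in for the
user database 62, and the function names are assumptions. How
a ticket is protected against tampering is not described in
the text and is not modelled here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Ticket:
    user_name: str
    email: str
    services: frozenset      # names of services the user may access
    filtered: bool = False   # content filtering requested by the user


# Hypothetical stand-in for the user database 62; only the log-in
# service reads it.
USER_DB = {
    "alice": {"email": "alice@example.com",
              "services": {"D", "E"},
              "filtered": True},
}


def log_in(user_name):
    """Log-in service: build a ticket from the user database, or
    return None if the user is not authorized."""
    record = USER_DB.get(user_name)
    if record is None:
        return None
    return Ticket(user_name, record["email"],
                  frozenset(record["services"]), record["filtered"])


def service_allows(service_name, ticket):
    """Any other service: decide from the ticket alone, without
    touching the user database."""
    return service_name in ticket.services
```

Since `service_allows` never consults `USER_DB`, only the
log-in path touches the database, which is the isolation
property described above.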

2. Redundancy of Services and Load Balancing 

[0068] It is also known to include certain redundancies in
the various services provided by the server 5. In
particular, a given service (e.g., e-mail) can be provided
by more than one physical or logical device. Each such
device is considered a "provider" of that service. If a
given provider is overloaded, or if the client 1 is unable
to contact that provider, the client 1 can contact any of
the other providers of that service. When the server 5
receives a log-in request from a client 1, in addition to
generating the above-described ticket 82, the log-in
service 78 dynamically generates a list of available WebTV™
services and provides this list to the client 1.
[0069] It is known that the server 5 can update the list of
services used by any client 1 to reflect services becoming
unavailable or services coming on-line. Also, the list of
services provided to each client 1 can be updated by the
server 5 based upon changes in the loading of the server 5,
in order to optimize traffic on the server 5. In addition, a
client's list of services can be updated by services other
than the log-in service 78, such that one service can
effectively introduce another service to the client 1. For
example, the e-mail service may provide a client 1 with the
name, port number and IP address of its address book
service. Thus, one service can effectively, and securely
within the same chain of trust, introduce another service to
the client 1.

[0070] This list of services includes the name of each
service, a port number for the provider of each service, and
an IP (Internet Protocol) address for each service.
Different providers of the same service are designated by
the same name, but different port numbers and/or IP
addresses. Note that in a standard URL, the protocol is
normally specified at the beginning of the URL, such as
"HTTP://www...." under the HTTP protocol. However, for
example, the normal protocol designation (i.e., "HTTP") in
the URL may be replaced with the name of the service, since
the port number and IP address for each service are known to
the client 1. Hence, the client 1 can access any of the
redundant providers of a given service using the same URL.
This procedure may effectively add a level of indirection to
all accesses made to any WebTV™ service and automatically
adds redundancy to the proxy service. It should also be
noted that separate service names can also refer to the same
service.

[0071] Assume, for example, that the e-mail service provided
by the WebTV™ system is designated by the service name
"WTV-mailto." A client 1 can access any provider of this
e-mail service using the same URL. The client 1 merely
chooses the appropriate port number and IP address to
distinguish between providers. If the client 1 is unable to
connect to one e-mail provider, it can simply contact the
next one in the list.
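
The failover behaviour described above might be sketched in
Python as follows. The provider records (dictionaries with
`name`, `ip` and `port` keys), the example addresses and the
`connect` callable, which is assumed to raise `OSError` when a
provider cannot be reached, are all hypothetical.

```python
def connect_to_service(name, providers, connect):
    """Try each provider of the named service in list order and return
    the first connection that succeeds."""
    for provider in providers:
        if provider["name"] != name:
            continue
        try:
            return connect(provider["ip"], provider["port"])
        except OSError:
            continue  # provider unreachable: try the next one
    raise ConnectionError(f"no provider reachable for {name}")
```

The caller names only the service (for example "WTV-mailto");
which provider actually answers is decided by the list
received at log-in.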
[0072] Thus, for example, and not forming part of the
invention, at log-in time, a client 1 is provided with both
a ticket containing privileges and capabilities as well as a
list of service providers, as illustrated in Figure 12.
Initially, the log-in service 78 determines whether the user
of client 1 is a valid user (step 1201). If not, log-in is
denied (step 1205). If the user is a valid user, then the
log-in service 78 gathers user information from the user
database 62 and generates a ticket 82 (step 1202). The
log-in service 78 also generates the above-described list of
services (step 1203). The ticket 82 and the list of services
are then downloaded to the client 1 (step 1204).
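
The log-in routine of Figure 12 can be summarised in a short
Python sketch. The dictionary-based `user_db` (standing in for
the user database 62), the provider records with an `up` flag
and the function name `handle_log_in` are assumptions made for
illustration.

```python
def handle_log_in(user, user_db, providers):
    """Return (ticket, service list) for a valid user, else None."""
    record = user_db.get(user)                      # step 1201
    if record is None:
        return None                                 # step 1205: denied
    ticket = {"user": user, **record}               # step 1202
    services = [p for p in providers if p["up"]]    # step 1203
    return ticket, services                         # step 1204
```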
[0073] As another example, if e-mail addressed to the user
has been received by the server 5, then the server 5 will
send a message to the client 1 indicating this fact. The
client 1 will then notify the user that e-mail is waiting by
a message displayed on the television set 12 or by an LED
(light emitting diode) built into the housing of WebTV™
box 10.

[0074] Thus, a method and apparatus have been described for
providing proxying and transcoding of documents in a
network. Although the present invention has been described
with reference to specific exemplary embodiments, it will be
evident that various modifications and changes may be made
to these embodiments.



Claims 

1. A method of providing proxy services to a client for
accessing a document stored in a remote server, in a
computer network that includes a proxying server coupled to
a client and to a remote server, the document including data
to be used by the client to provide a display, the client
submitting a request to the proxy server for a document
stored in a remote server and the proxy server retrieving
the document from the remote server, the method
characterised by the steps of:

submitting a request from the client to the proxying server
for a document;
providing a persistent database at the proxying server, the
persistent database including information relating to the
document and corresponding to a plurality of error
conditions;
using the information included in the persistent database to
transcode at the proxying server the data in the document in
order to perform the function of correcting the error
conditions in the document by performing the steps of:

analyzing the data in the document using the information
corresponding to the plurality of error conditions to
determine whether the data is likely to cause one of the
plurality of error conditions to occur when used by the
client; and

automatically revising the data if the data is determined in
the analyzing step to be likely to cause one of the
plurality of error conditions to occur when used by the
client; and

transmitting the transcoded document to the client.

2. A method according to claim 1, further comprising 
the step of storing in the persistent database validity 
information corresponding to the document. 

3. A method according to claim 2, wherein the validity
information is based on an observed rate of change
of the document.



4. A method according to claim 1, further comprising 
the step of storing in the persistent database per- 
formance information relating to performance of the 
remote server when accessing the document. 

5. A method according to claim 4, wherein the per- 
formance information is a latency value. 

6. A method according to claim 1, further comprising
the step of storing in the persistent database infor- 
mation for optimizing memory usage by the client. 



Patentansprüche

1. Ein Verfahren zum Bereitstellen von Proxy-Diensten für
einen Client, um auf ein bei einem fernen Server
gespeichertes Dokument zuzugreifen, in einem
Computernetzwerk, das einen mit einem Client und einem
fernen Server gekoppelten Proxy-Server enthält, wobei das
Dokument von dem Client zum Bereitstellen einer Anzeige zu
verwendende Daten enthält, wobei der Client dem Proxy-Server
eine Anforderung für ein bei einem fernen Server
gespeichertes Dokument vorlegt, und der Proxy-Server das
Dokument von dem fernen Server abruft,

wobei das Verfahren durch die Schritte gekennzeichnet ist:

daß dem Proxy-Server von dem Client eine Anforderung für ein
Dokument vorgelegt wird;
daß bei dem Proxy-Server eine persistente Datenbank
bereitgestellt wird, wobei die persistente Datenbank das
Dokument betreffende und einer Mehrzahl von
Fehlerbedingungen zugeordnete Informationen enthält;
daß die in der persistenten Datenbank enthaltenen
Informationen verwendet werden, um bei dem Proxy-Server die
Daten in dem Dokument umzucodieren, um die Funktion des
Korrigierens der Fehlerbedingungen in dem Dokument
auszuführen, indem die Schritte ausgeführt werden:

daß die Daten in dem Dokument unter Verwendung der der
Mehrzahl der Fehlerbedingungen zugeordneten Informationen
analysiert werden, um zu bestimmen, ob die Daten bei der
Verwendung durch den Client wahrscheinlich das Auftreten
einer der Mehrzahl der Fehlerbedingungen verursachen würden;
und
daß die Daten automatisch revidiert werden, wenn in dem
Analysierschritt bestimmt wird, daß die Daten bei der
Verwendung durch den Client wahrscheinlich das Auftreten
einer der Mehrzahl der Fehlerbedingungen verursachen würden;
und
daß das umcodierte Dokument an den Client gesendet wird.

2. Ein Verfahren nach Anspruch 1, ferner mit dem Schritt,
daß in der persistenten Datenbank dem Dokument zugeordnete
Gültigkeitsinformationen gespeichert werden.

3. Ein Verfahren nach Anspruch 2, wobei die
Gültigkeitsinformationen auf einer beobachteten
Änderungsrate des Dokuments basieren.

4. Ein Verfahren nach Anspruch 1, ferner mit dem Schritt,
daß in der persistenten Datenbank die Performance des fernen
Servers beim Zugriff auf das Dokument betreffende
Performance-Informationen gespeichert werden.

5. Ein Verfahren nach Anspruch 4, wobei die
Performance-Informationen ein Latenzwert sind.

6. Ein Verfahren nach Anspruch 1, ferner mit dem Schritt,
daß in der persistenten Datenbank Informationen zum
Optimieren der Speichernutzung durch den Client gespeichert
werden.



Revendications 

1. Un procédé pour fournir des services d'agent
intermédiaire (proxy) à un client pour accéder à un document
stocké sur un serveur distant, dans un réseau informatique
qui comprend un serveur mandataire couplé à un client et à
un serveur distant, le document comprenant des données
devant être utilisées par le client pour fournir un
affichage, le client soumettant une demande au serveur
mandataire pour un document stocké sur un serveur distant,
et le serveur mandataire récupérant le document depuis le
serveur distant, le procédé étant caractérisé par les étapes
consistant à :

soumettre une demande du client au serveur mandataire pour
un document ;
fournir une base de données persistante au niveau du serveur
mandataire, la base de données persistante comprenant des
informations concernant le document et correspondant à une
pluralité de conditions d'erreur ;
utiliser les informations comprises dans la base de données
persistante pour transcoder, au niveau du serveur
mandataire, les données dans le document, afin de réaliser
la fonction de correction des conditions d'erreur dans le
document en effectuant les étapes consistant à :

analyser les données dans le document en utilisant les
informations correspondant à la pluralité des conditions
d'erreur, pour déterminer si les données sont susceptibles,
lorsqu'elles sont utilisées par le client, de faire survenir
une parmi la pluralité des conditions d'erreur ; et
corriger automatiquement les données s'il est déterminé, au
cours de l'étape d'analyse, que les données sont
susceptibles, lorsqu'elles sont utilisées par le client, de
faire survenir une parmi la pluralité de conditions
d'erreur, et

transmettre au client le document transcodé.

2. Un procédé selon la revendication 1, comprenant en outre
l'étape consistant à stocker, dans la base de données
persistante, des informations de validité correspondant au
document.

3. Un procédé selon la revendication 2, dans lequel les
informations de validité sont basées sur un taux observé de
changement du document.

4. Un procédé selon la revendication 1, comprenant en outre
l'étape consistant à stocker, dans la base de données
persistante, des informations de performance concernant les
performances du serveur distant lors de l'accès au document.

5. Un procédé selon la revendication 4, dans lequel
l'information de performance est une valeur de latence.

6. Un procédé selon la revendication 1, comprenant en outre
l'étape consistant à stocker, dans la base de données
persistante, des informations pour optimiser l'utilisation
de la mémoire par le client.
